@jserv jserv commented Nov 21, 2025

The preemptive EDF (Earliest Deadline First) scheduler requires a mechanism for tasks to voluntarily yield CPU time while blocked on delays. In RISC-V M-mode, the ecall instruction provides a clean way to trigger a synchronous trap that invokes the scheduler without relying on timer interrupts.

The dispatcher function now accepts a from_timer parameter to distinguish between timer-driven preemption and ecall-driven yields. Timer interrupts increment the system tick counter and process time slices, while ecall-based yields skip tick advancement to prevent time drift.
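A minimal sketch of that split, assuming a simplified kernel state: `sys_ticks` and `switch_count` are hypothetical names, and the real dispatcher also handles time slices and task selection.

```c
#include <stdint.h>

/* Hypothetical kernel state, for illustration only. */
static uint32_t sys_ticks;    /* advanced only on timer interrupts */
static uint32_t switch_count; /* counts every pass through the dispatcher */

/* from_timer = 1: timer-driven preemption; from_timer = 0: ecall-driven yield. */
void dispatcher(int from_timer)
{
    if (from_timer)
        sys_ticks++; /* only a real timer interrupt moves system time forward */

    /* Both paths go on to pick the next task; the yield path leaves time alone. */
    switch_count++;
}
```

Calling `dispatcher(1)` once and `dispatcher(0)` twice leaves `sys_ticks` at 1, which is the property that prevents yields from inflating the tick count.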

When handling ecall from M-mode, the trap handler advances mepc past the 4-byte ecall instruction to prevent re-execution upon return. Critically, the ISR stack frame must also be updated, because the ISR epilogue restores mepc from the saved frame rather than reading the CSR directly. Without this fix, mret would jump back to the ecall instruction, causing an infinite trap loop.
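The following host-side sketch models that fix. The frame layout, `csr_mepc`, and `handle_trap()` are assumptions made for illustration; only the mcause value 11 (environment call from M-mode) and the 4-byte ecall width come from the RISC-V specification.

```c
#include <stdint.h>

#define MCAUSE_ECALL_MMODE 11u /* environment call from M-mode */

/* Hypothetical frame layout: what matters is the slot the epilogue reloads mepc from. */
struct isr_frame {
    uint32_t regs[31];
    uint32_t mepc;
};

static uint32_t csr_mepc; /* stand-in for the real mepc CSR */

/* Returns 1 if the trap was an M-mode ecall and the return address was advanced. */
static int handle_trap(uint32_t mcause, struct isr_frame *frame)
{
    if (mcause != MCAUSE_ECALL_MMODE)
        return 0;

    uint32_t next = csr_mepc + 4; /* ecall has no compressed form, so it is always 4 bytes */
    csr_mepc = next;
    frame->mepc = next; /* the epilogue restores from the frame, so this write is the one that takes effect */
    return 1;
}
```

Updating only `csr_mepc` would be undone as soon as the epilogue reloads the stale value from the frame, which is the infinite-loop case described above.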

The logger subsystem gains a flush mechanism with a direct_mode flag that bypasses the async queue. This keeps multi-line output such as statistics reports in order: printf normally enqueues to a ring buffer that the logger task drains asynchronously, so lines written directly could otherwise overtake lines still waiting in the queue.
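As a rough model of this behavior, the sketch below uses a small ring buffer and an in-memory `console` array standing in for the UART. All names here (`log_puts`, `log_drain`, `log_flush`, `log_async_resume`) are hypothetical and are not the kernel's actual API.

```c
#include <stddef.h>
#include <string.h>

#define QUEUE_CAP 8
#define LINE_LEN 32

static char queue[QUEUE_CAP][LINE_LEN];
static size_t q_head, q_count;
static int direct_mode;
static char console[256]; /* stand-in for the UART */

static void console_write(const char *s)
{
    strncat(console, s, sizeof(console) - strlen(console) - 1);
}

/* Enqueue in async mode, write straight through in direct mode. */
static int log_puts(const char *s)
{
    if (direct_mode) {
        console_write(s);
        return 0;
    }
    if (q_count == QUEUE_CAP)
        return -1; /* queue full: this sketch simply drops the line */
    size_t slot = (q_head + q_count) % QUEUE_CAP;
    strncpy(queue[slot], s, LINE_LEN - 1);
    queue[slot][LINE_LEN - 1] = '\0';
    q_count++;
    return 0;
}

/* What the logger task would do when it gets scheduled. */
static void log_drain(void)
{
    while (q_count) {
        console_write(queue[q_head]);
        q_head = (q_head + 1) % QUEUE_CAP;
        q_count--;
    }
}

/* Drain pending lines first, then bypass the queue until resumed. */
static void log_flush(void)
{
    log_drain();
    direct_mode = 1;
}

static void log_async_resume(void)
{
    direct_mode = 0;
}
```

Draining before switching modes is what preserves ordering: anything queued earlier reaches the console before the first direct write.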

The rtsched test application validates the EDF implementation by running periodic RT tasks with different periods and deadlines, measuring execution counts, deadline misses, response times, and jitter to verify correct scheduler behavior.

Close #26


Summary by cubic

Introduce a preemptive EDF scheduler with ecall-based context switching on RISC‑V and a flushable logger for ordered output. Includes a test app that validates deadlines, response times, jitter, fairness, and non‑RT starvation.

  • New Features

    • EDF RT scheduler: picks earliest absolute deadline; deadlines reset on period expiry; non‑RT uses round‑robin.
    • RISC‑V ecall yield: trap advances mepc and ISR frame, passes/returns SP to do_trap, and calls dispatcher(0); timer ISR calls dispatcher(1) and maintains mtimecmp.
    • Preemptive plumbing: dispatcher(int from_timer); TCB.sp; hal_build_initial_frame(), hal_switch_stack(); hal_timer_irq_enable()/disable(); NOSCHED toggles MTIE and is gated by scheduler_started; direct UART in traps; default task stack is 8 KiB.
    • Logger: mo_logger_flush(), mo_logger_async_resume(), mo_logger_direct_mode(); printf/puts honor direct mode to keep multi‑line reports ordered.
    • rtsched app: 3 periodic RT tasks + background + idle; aggregates executions/misses/response/jitter and prints fairness/starvation summary.
  • Migration

    • Update dispatcher() calls to dispatcher(int from_timer) (timer: 1; ecall/yield: 0).
    • Wrap multi‑line reports with mo_logger_flush() … mo_logger_async_resume() for ordered output.
    • For non‑RISC‑V ports, implement hal_switch_stack(), hal_build_initial_frame(), and hal_timer_irq_enable()/disable().
    • Note: NOSCHED macros no longer retarget mtimecmp; they only toggle the timer interrupt bit and respect scheduler_started. Default task stack is now 8 KiB.
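
The EDF rule in the feature list (pick the earliest absolute deadline, reset the deadline on period expiry) can be sketched as follows. The `rt_task` struct and both function names are hypothetical, and the sketch assumes each task's relative deadline equals its period.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task RT bookkeeping. */
struct rt_task {
    uint32_t deadline; /* absolute deadline, in ticks */
    uint32_t period;   /* release period, in ticks */
    int ready;
};

/* Return the index of the ready task with the earliest deadline, or -1 if none is ready. */
static int edf_pick(const struct rt_task *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!t[i].ready)
            continue;
        /* Signed difference keeps the comparison correct across tick wraparound. */
        if (best < 0 || (int32_t)(t[i].deadline - t[best].deadline) < 0)
            best = (int)i;
    }
    return best;
}

/* When the current period has expired, move the deadline one period ahead. */
static void edf_release(struct rt_task *t, uint32_t now)
{
    if ((int32_t)(now - t->deadline) >= 0)
        t->deadline += t->period;
}
```

If `edf_pick()` returns -1, a caller would fall back to the non-RT round-robin path.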

Written for commit 424616f. Summary will update automatically on new commits.


@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 8 files

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="app/rtsched.c">

<violation number="1" location="app/rtsched.c:238">
Expected job counts use floor division, so RT statistics under-report by one when the test duration is not an exact multiple of the task period. Use a ceiling calculation to reflect the actual number of releases.</violation>
</file>

<file name="kernel/task.c">

<violation number="1" location="kernel/task.c:478">
Blocked tasks are force-selected as placeholders and immediately reclassified as RUNNING, so `mo_task_delay()` returns before the requested ticks elapse and the delay counter never reaches zero. Tasks can no longer sleep when every ready task is blocked.</violation>
</file>

<file name="arch/riscv/hal.c">

<violation number="1" location="arch/riscv/hal.c:313">
`__builtin_frame_address(0)` does not point to the ISR trap frame, so MEPC is never updated and the ecall keeps retriggering instead of yielding.</violation>
</file>
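
For the first issue (app/rtsched.c), the difference between the floor and ceiling counts can be illustrated with the sketch below. It assumes the first job is released at tick 0 and that the test window is half-open, `[0, duration)`; the function names are hypothetical.

```c
#include <stdint.h>

/* Floor division: misses the final release when duration is not a multiple of period. */
static uint32_t jobs_floor(uint32_t duration, uint32_t period)
{
    return duration / period;
}

/* Ceiling division: counts every release at 0, period, 2*period, ... below duration.
 * Note: duration + period - 1 can overflow for values near UINT32_MAX. */
static uint32_t jobs_ceil(uint32_t duration, uint32_t period)
{
    return (duration + period - 1) / period;
}
```

With a duration of 100 ticks and a period of 30, releases occur at ticks 0, 30, 60, and 90, so the ceiling form yields 4 while the floor form yields 3. When the duration is an exact multiple of the period, the two agree.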


    list_node_t *any_node = list_next(kcb->tasks->head);
    while (any_node && any_node != kcb->tasks->tail) {
        if (any_node->data) {
            kcb->task_current = any_node;

@cubic-dev-ai cubic-dev-ai bot Nov 21, 2025

Blocked tasks are force-selected as placeholders and immediately reclassified as RUNNING, so mo_task_delay() returns before the requested ticks elapse and the delay counter never reaches zero. Tasks can no longer sleep when every ready task is blocked.

Prompt for AI agents
Address the following comment on kernel/task.c at line 478:

<comment>Blocked tasks are force-selected as placeholders and immediately reclassified as RUNNING, so `mo_task_delay()` returns before the requested ticks elapse and the delay counter never reaches zero. Tasks can no longer sleep when every ready task is blocked.</comment>

<file context>
@@ -443,7 +463,29 @@ uint16_t sched_select_next_task(void)
+        list_node_t *any_node = list_next(kcb->tasks->head);
+        while (any_node && any_node != kcb->tasks->tail) {
+            if (any_node->data) {
+                kcb->task_current = any_node;
+                tcb_t *any_task = any_node->data;
+                return any_task->id;
</file context>
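
One way to avoid the problem described in this comment is to select only tasks that are actually ready and let the caller idle otherwise. The sketch below illustrates that idea under assumed types; it is not necessarily the fix adopted in this PR, and `struct task`, `NO_TASK`, `select_next()`, and `delay_tick()` are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum task_state { TASK_READY, TASK_RUNNING, TASK_BLOCKED };

struct task {
    uint16_t id;
    enum task_state state;
    uint32_t delay; /* remaining ticks while blocked */
};

#define NO_TASK 0xFFFFu /* hypothetical sentinel: nothing is runnable */

/* Pick the first READY task; never promote a BLOCKED task to RUNNING. */
static uint16_t select_next(struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].state == TASK_READY) {
            tasks[i].state = TASK_RUNNING;
            return tasks[i].id;
        }
    }
    return NO_TASK; /* caller idles until a tick wakes a task */
}

/* Timer tick: count down delays and wake tasks whose delay has elapsed. */
static void delay_tick(struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].state == TASK_BLOCKED && tasks[i].delay > 0 &&
            --tasks[i].delay == 0)
            tasks[i].state = TASK_READY;
    }
}
```

Because blocked tasks keep their state until `delay_tick()` wakes them, a delay runs for the full number of ticks requested.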

@jserv jserv force-pushed the rtsched branch 4 times, most recently from 6c67a2d to 489485c on November 21, 2025 16:37
This commit introduces a preemptive Earliest Deadline First (EDF) scheduler
that uses RISC-V ecall instructions for voluntary context switches while
preserving the existing cooperative scheduling mode.

The preemptive scheduler required several architectural changes. Tasks now
maintain separate stack pointer (sp) fields for ISR-based context switching,
distinct from the jmp_buf context used in cooperative mode. The dispatcher
accepts a from_timer parameter to distinguish timer-driven preemption from
voluntary yields, ensuring tick counters only increment on actual timer
interrupts.

Context switching in preemptive mode builds ISR stack frames with mepc
pointing to task entry points, allowing mret to resume execution. The ecall
handler invokes the dispatcher directly, enabling tasks to yield without
relying on setjmp/longjmp which are incompatible with interrupt contexts.
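
As an illustration of the frame-building step, the sketch below places a zeroed frame at the top of a task stack and stores the entry point in the mepc slot. The frame size, the slot index, and the `build_initial_frame()` name and signature are assumptions; the actual hal_build_initial_frame() may differ, and a real port must also respect the ABI's stack alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_WORDS 32 /* hypothetical: 31 saved registers plus mepc */
#define FRAME_MEPC 31  /* hypothetical index of the mepc slot */

/* Example task entry point used only for demonstration. */
static void demo_task(void) {}

/* Build a frame at the top of the stack and return the task's initial stack pointer. */
static uintptr_t *build_initial_frame(uintptr_t *stack_base, size_t stack_words,
                                      void (*entry)(void))
{
    uintptr_t *sp = stack_base + stack_words - FRAME_WORDS;
    memset(sp, 0, FRAME_WORDS * sizeof(*sp));
    sp[FRAME_MEPC] = (uintptr_t)entry; /* the first mret then lands on the entry point */
    return sp;
}
```

The returned pointer is what would be stored in the task's sp field, so the ISR epilogue can restore the frame as if the task had been interrupted at its first instruction.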

The cooperative mode preserves its setjmp/longjmp semantics. The dispatcher
always calls hal_context_restore() even when the same task continues,
because the longjmp completes the save/restore cycle initiated by
hal_context_save(). The hal_interrupt_tick() function enables interrupts
on a task's first run by detecting when the entry point still resides in
the context's return address slot.
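
The save/restore pairing can be seen in a plain setjmp/longjmp round trip. This sketch uses the standard C library calls directly rather than hal_context_save() and hal_context_restore(), whose exact signatures are not shown here; `coop_roundtrip()` is a hypothetical name.

```c
#include <setjmp.h>

static jmp_buf ctx;
static int resumed; /* static, so its value is well defined after longjmp */

/* Returns the number of times execution has resumed through the saved context. */
static int coop_roundtrip(void)
{
    if (setjmp(ctx) == 0) {
        /* Save path: setjmp returned directly. Even when the same task is
         * chosen again, control only continues past the save point through
         * the matching longjmp. */
        longjmp(ctx, 1);
    }
    /* Restore path: setjmp returned a second time, with a nonzero value. */
    resumed++;
    return resumed;
}
```

Code after the save point runs only on the restore path, which is why skipping the restore for an unchanged task would leave the cycle incomplete.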

Real-time scheduling support includes EDF with deadline-based priority
calculation, configurable through mo_task_rt_priority(). The RT scheduler
hook in KCB allows custom scheduling policies. Delay handling was enhanced
with batch updates to minimize critical section duration.

The logger subsystem gained a direct_mode flag for ISR-safe output, and
printf was made flush-aware to support synchronous output when needed.
Exception handling uses trap_puts() to avoid printf deadlock in trap
context.

Close #26
@github-actions

Linmo CI Test Results

Overall Status: ✅ passed
Timestamp: 2025-11-21T16:59:30+00:00
Commit: 8447e2c

Toolchain Results

| Toolchain | Build | Crash Test | Functional |
|-----------|-------|------------|------------|
| GNU | ✅ passed | ✅ passed | ✅ passed |
| LLVM | ✅ passed | ⏭️ skipped | ⏭️ skipped |

Application Tests

| App | GNU | LLVM |
|-----|-----|------|
| cond | ✅ passed | ⏭️ skipped |
| coop | ✅ passed | ⏭️ skipped |
| cpubench | ✅ passed | ⏭️ skipped |
| echo | ✅ passed | ⏭️ skipped |
| hello | ✅ passed | ⏭️ skipped |
| mqueues | ✅ passed | ⏭️ skipped |
| mutex | ✅ passed | ⏭️ skipped |
| pipes | ✅ passed | ⏭️ skipped |
| pipes_small | ✅ passed | ⏭️ skipped |
| pipes_struct | ✅ passed | ⏭️ skipped |
| prodcons | ✅ passed | ⏭️ skipped |
| progress | ✅ passed | ⏭️ skipped |
| rtsched | ✅ passed | ⏭️ skipped |
| semaphore | ✅ passed | ⏭️ skipped |
| suspend | ✅ passed | ⏭️ skipped |
| test64 | ✅ passed | ⏭️ skipped |
| test_libc | ✅ passed | ⏭️ skipped |
| timer | ✅ passed | ⏭️ skipped |
| timer_kill | ✅ passed | ⏭️ skipped |

Functional Test Details

| Test | GNU | LLVM |
|------|-----|------|
| mutex:data_consistency | ✅ passed | ⏭️ skipped |
| mutex:fairness | ✅ passed | ⏭️ skipped |
| mutex:mutual_exclusion | ✅ passed | ⏭️ skipped |
| mutex:overall | ✅ passed | ⏭️ skipped |
| semaphore:overall | ✅ passed | ⏭️ skipped |

Report generated from test-summary.toml

@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Nov 21, 2025
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Nov 21, 2025
cubic-dev-ai[bot]

This comment was marked as resolved.

@jserv jserv merged commit 0b74337 into main Nov 21, 2025
7 checks passed
@jserv jserv deleted the rtsched branch November 21, 2025 17:53
Development

Successfully merging this pull request may close these issues.

RT Scheduler Fairness Failure

2 participants